20. TfIdf It

Question:

Transform word_data into a tf-idf matrix using sklearn's tf-idf transformation. Remove English stop words.

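You can access the mapping between words and feature numbers using get_feature_names(), which returns a list of all the words in the vocabulary. (In scikit-learn 1.0 and later the method is get_feature_names_out(); get_feature_names() was removed in version 1.2.) How many different words are there?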

INSTRUCTOR NOTE:

Be sure to use the TfidfVectorizer class to transform the word data.

Don't forget to remove English stop words when you set up the vectorizer, using sklearn's stop word list (not NLTK's).
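
As a rough illustration of the setup described above, here is a minimal sketch on a toy corpus. The three sentences assigned to word_data are made up for the example and stand in for the real word_data from the exercise, so the count printed here is not the quiz answer. The hasattr check is an assumption about which scikit-learn version is installed: newer releases expose get_feature_names_out(), older ones only get_feature_names().

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for the exercise's word_data (a list of strings).
word_data = [
    "the meeting is scheduled for monday morning",
    "please review the attached contract before the meeting",
    "the contract was signed on monday",
]

# stop_words="english" selects sklearn's built-in English stop word list.
vectorizer = TfidfVectorizer(stop_words="english")

# fit_transform learns the vocabulary and returns a sparse tf-idf matrix
# with one row per document and one column per vocabulary word.
tfidf_matrix = vectorizer.fit_transform(word_data)

# Newer scikit-learn versions provide get_feature_names_out();
# fall back to get_feature_names() on older installs.
if hasattr(vectorizer, "get_feature_names_out"):
    vocabulary = list(vectorizer.get_feature_names_out())
else:
    vocabulary = list(vectorizer.get_feature_names())

num_words = len(vocabulary)
print("Number of distinct words:", num_words)
```

The index of each word in vocabulary matches its column number in tfidf_matrix, so len(vocabulary) and tfidf_matrix.shape[1] should agree.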